hive:数据库“行专列”操作

您所在的位置:网站首页 hive string_agg hive:数据库“行专列”操作

hive:数据库“行专列”操作

#hive:数据库“行专列”操作| 来源: 网络整理| 查看: 265

hive:数据库“行专列”操作---使用collect_set/collect_list/collect_all & row_number()over(partition by 分组字段 [order by 排序字段]) 转载

mob60475707384d 2018-01-13 20:55:00

文章标签 Hadoop+Spark hive 字段 调优 mysql 文章分类 代码人生

方案一:请参考《数据库“行专列”操作---使用row_number()over(partition by 分组字段 [order by 排序字段])》,该方案是sqlserver,oracle,mysql,hive均适用的。

在hive中的方案分为以下两种方案:

创建测试表,并插入测试数据:

--hive 测试 行转列 collect_set collect_list create table tommyduan_test( gridid string, height int, cell string, mrcount int, weakmrcount int ); insert into tommyduan_test values('g1',1,'cell1',12,3); insert into tommyduan_test values('g1',1,'cell2',22,3); insert into tommyduan_test values('g1',1,'cell3',23,3); insert into tommyduan_test values('g1',1,'cell4',1,3); insert into tommyduan_test values('g1',1,'cell5',3,3); insert into tommyduan_test values('g1',1,'cell6',4,3); insert into tommyduan_test values('g1',1,'cell19',21,3); insert into tommyduan_test values('g2',1,'cell4',1,3); insert into tommyduan_test values('g2',1,'cell5',3,3); insert into tommyduan_test values('g2',1,'cell6',4,3); insert into tommyduan_test values('g2',1,'cell19',21,3); 方案二:使用collect_set方案

注意:collect_set是一个set集合,不允许重复的记录插入

select gridid,height,collect_list(cell) cellArray,collect_list(mrcount) mrcountArray,collect_list(weakmrcount) weakmrcountArray from ( select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn from tommyduan_test group by gridid,height,cell,mrcount,weakmrcount ) t10 where rn0 then cellArray[0] else '-9999' end) as cell1, (case when size(cellArray)>0 then mrcountArray[0] else '-9999' end) as cell1_mrcount, (case when size(cellArray)>0 then weakmrcountArray[0] else '-9999' end) as cell1_weakmrcount, (case when size(cellArray)>1 then cellArray[1] else '-9999' end) as cell2, (case when size(cellArray)>1 then mrcountArray[1] else '-9999' end) as cell2_mrcount, (case when size(cellArray)>1 then weakmrcountArray[1] else '-9999' end) as cell2_weakmrcount, (case when size(cellArray)>2 then cellArray[2] else '-9999' end) as cell3, (case when size(cellArray)>2 then mrcountArray[2] else '-9999' end) as cell3_mrcount, (case when size(cellArray)>2 then weakmrcountArray[2] else '-9999' end) as cell3_weakmrcount from ( select gridid,height,collect_list(cell) cellArray,collect_list(mrcount) mrcountArray,collect_list(weakmrcount) weakmrcountArray from ( select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn from tommyduan_test group by gridid,height,cell,mrcount,weakmrcount ) t10 where rn0 then split(cellArray[0],'_')[1] else '-9999' end) as cell1_mrcount, (case when size(cellArray)>0 then split(cellArray[0],'_')[2] else '-9999' end) as cell1_weakmrcount, (case when size(cellArray)>1 then split(cellArray[1],'_')[0] else '-9999' end) as cell2, (case when size(cellArray)>1 then split(cellArray[1],'_')[1] else '-9999' end) as cell2_mrcount, (case when size(cellArray)>1 then split(cellArray[1],'_')[2] else '-9999' end) as cell2_weakmrcount, (case when size(cellArray)>2 then split(cellArray[2],'_')[0] else '-9999' end) as cell3, (case when size(cellArray)>2 then split(cellArray[2],'_')[1] else '-9999' end) as cell3_mrcount, (case when size(cellArray)>2 then split(cellArray[2],'_')[2] else '-9999' end) as cell3_weakmrcount from ( select gridid,height,collect_set(concat_ws('_',cell,cast(mrcount as string), cast(weakmrcount as string))) as cellArray from ( select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn from tommyduan_test group by gridid,height,cell,mrcount,weakmrcount ) t10 where rn 本文章为转载内容,我们尊重原作者对文章享有的著作权。如有内容错误或侵权问题,欢迎原作者联系我们进行内容更正或删除文章。 收藏 评论 分享 举报

上一篇:Hive:有表A与表B进行inner join,如果A分组内包含有数据,使用A,否则使用B分组下的数据 Hive&SqlServerql:inner join on条件中如果两边都是空值的情况下,关联结果中会把数据给过滤掉

下一篇:数据库“行专列”操作---使用row_number()over(partition by 分组字段 [order by 排序字段])



【本文地址】


今日新闻


推荐新闻


CopyRight 2018-2019 办公设备维修网 版权所有 豫ICP备15022753号-3